fix: #10 - Improve crawling of initially redirected requests #11

Conversation

JJetmar
Contributor

@JJetmar JJetmar commented Aug 11, 2025

This commit introduces a new feature to limit crawling to links from the same domain.

It updates the base Docker image to Node 22.

A new input field in the schema lets users choose whether only links from the same domain should be enqueued. When the option is enabled, the crawling logic skips links that point to other domains, and the initial domain is passed along to subsequent requests. The commit also fixes an issue where the crawler would not crawl initially redirected requests.
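
For reviewers who want a concrete picture of the same-domain check, here is a minimal sketch, not the code from this commit. The names `CrawlOptions`, `sameDomainOnly`, `filterLinks`, and `resolveInitialDomain` are hypothetical, and taking the domain from the post-redirect URL is only an assumption about one way an initial redirect could be handled.

```typescript
// Hypothetical option shape; the real input field name in the schema may differ.
interface CrawlOptions {
    sameDomainOnly: boolean;
}

// Returns the lowercased hostname of an absolute URL, or null if it cannot be parsed.
function hostnameOf(url: string): string | null {
    try {
        return new URL(url).hostname.toLowerCase();
    } catch {
        return null;
    }
}

// Keeps only links whose hostname equals the initial domain when the option is on.
// Unparseable or relative links are dropped in that case.
function filterLinks(links: string[], initialDomain: string, options: CrawlOptions): string[] {
    if (!options.sameDomainOnly) return links;
    const expected = initialDomain.toLowerCase();
    return links.filter((link) => hostnameOf(link) === expected);
}

// Assumption: if the first request was redirected, use the URL that was actually
// loaded as the source of the initial domain, falling back to the requested URL.
function resolveInitialDomain(requestedUrl: string, loadedUrl?: string): string | null {
    return hostnameOf(loadedUrl ?? requestedUrl);
}
```

This sketch compares exact hostnames, so `www.example.com` and `example.com` count as different domains; whether the commit treats them that way is not stated here.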

@JJetmar
Contributor Author

JJetmar commented Aug 11, 2025

cc @metalwarrior665 @vladfrangu

@metalwarrior665
Contributor

Thanks

@metalwarrior665 metalwarrior665 merged commit a50aebd into apify-projects:master Aug 13, 2025